7 research outputs found

    Case for holistic query evaluation

    Get PDF
    In this thesis we present the holistic query evaluation model. We propose a novel query engine design that exploits the characteristics of modern processors when queries execute inside main memory. The holistic model (a) is based on template-based code generation for each executed query, (b) uses multithreading to adapt to multicore processor architectures and (c) addresses the optimization problem of scheduling multiple threads for intra-query parallelism. Main-memory query execution is a usual operation in modern database servers equipped with tens or hundreds of gigabytes of RAM. In such an execution environment, the query engine needs to adapt to the CPU characteristics to boost performance. For this purpose, holistic query evaluation applies customized code generation to database query evaluation. The idea is to use a collection of highly efficient code templates and dynamically instantiate them to create query- and hardware-specific source code. The source code is compiled and dynamically linked to the database server for processing. Code generation diminishes the bloat of higher-level programming abstractions necessary for implementing generic, interpreted, SQL query engines. At the same time, the generated code is customized for the hardware it will run on. The holistic model supports the most frequently used query processing algorithms, namely sorting, partitioning, join evaluation, and aggregation, thus allowing the efficient evaluation of complex DSS or OLAP queries. Modern CPUs follow multicore designs with multiple threads running in parallel. The dataflow of query engine algorithms needs to be adapted to exploit such designs. We identify memory accesses and thread synchronization as the main bottlenecks in a multicore execution environment. We extend the holistic query evaluation model and propose techniques to mitigate the impact of these bottlenecks on multithreaded query evaluation. We analytically model the expected performance and scalability of the proposed algorithms according to the hardware specifications. The analytical performance expressions can be used by the optimizer to statically estimate the speedup of multithreaded query execution. Finally, we examine the problem of thread scheduling in the context of multithreaded query evaluation on multicore CPUs. The search space for possible operator execution schedules scales fast, thus forbidding the use of exhaustive techniques. We model intra-query parallelism on multicore systems and present scheduling heuristics that result in different degrees of schedule quality and optimization cost. We identify cases where each of our proposed algorithms, or combinations of them, are expected to generate schedules of high quality at an acceptable running cost

    Case for holistic query evaluation

    No full text
    In this thesis we present the holistic query evaluation model. We propose a novel query engine design that exploits the characteristics of modern processors when queries execute inside main memory. The holistic model (a) is based on template-based code generation for each executed query, (b) uses multithreading to adapt to multicore processor architectures and (c) addresses the optimization problem of scheduling multiple threads for intra-query parallelism. Main-memory query execution is a usual operation in modern database servers equipped with tens or hundreds of gigabytes of RAM. In such an execution environment, the query engine needs to adapt to the CPU characteristics to boost performance. For this purpose, holistic query evaluation applies customized code generation to database query evaluation. The idea is to use a collection of highly efficient code templates and dynamically instantiate them to create query- and hardware-specific source code. The source code is compiled and dynamically linked to the database server for processing. Code generation diminishes the bloat of higher-level programming abstractions necessary for implementing generic, interpreted, SQL query engines. At the same time, the generated code is customized for the hardware it will run on. The holistic model supports the most frequently used query processing algorithms, namely sorting, partitioning, join evaluation, and aggregation, thus allowing the efficient evaluation of complex DSS or OLAP queries. Modern CPUs follow multicore designs with multiple threads running in parallel. The dataflow of query engine algorithms needs to be adapted to exploit such designs. We identify memory accesses and thread synchronization as the main bottlenecks in a multicore execution environment. We extend the holistic query evaluation model and propose techniques to mitigate the impact of these bottlenecks on multithreaded query evaluation. We analytically model the expected performance and scalability of the proposed algorithms according to the hardware specifications. The analytical performance expressions can be used by the optimizer to statically estimate the speedup of multithreaded query execution. Finally, we examine the problem of thread scheduling in the context of multithreaded query evaluation on multicore CPUs. The search space for possible operator execution schedules scales fast, thus forbidding the use of exhaustive techniques. We model intra-query parallelism on multicore systems and present scheduling heuristics that result in different degrees of schedule quality and optimization cost. We identify cases where each of our proposed algorithms, or combinations of them, are expected to generate schedules of high quality at an acceptable running cost.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Strongly consistent replication for a bargain

    No full text
    Abstract — Strong consistency is an important correctness property for replicated databases. It ensures that each transaction accesses the latest committed database state as provided in centralized databases. Achieving strong consistency in replicated databases is a major performance challenge and is typically not provided, exposing inconsistent data to client applications. We propose two scalable techniques that exploit lazy update propagation and workload information to guarantee strong consistency by delaying transaction start. We implement a prototype replicated database system and incorporate the proposed techniques for providing strong consistency. Extensive experiments using both a micro-benchmark and the TPC-W benchmark demonstrate that our proposals are viable and achieve considerable scalability while maintaining strong consistency. I
    corecore